affine hull
Compositional Risk Minimization
Mahajan, Divyat, Pezeshki, Mohammad, Mitliagkas, Ioannis, Ahuja, Kartik, Vincent, Pascal
In this work, we tackle a challenging and extreme form of subpopulation shift, termed compositional shift. Under compositional shifts, some combinations of attributes are entirely absent from the training distribution but present in the test distribution. We model the data with flexible additive energy distributions, where each energy term represents an attribute, and derive a simple alternative to empirical risk minimization termed compositional risk minimization (CRM). We first train an additive energy classifier to predict the multiple attributes and then adjust this classifier to tackle compositional shifts. We provide an extensive theoretical analysis of CRM, where we show that our proposal extrapolates to special affine hulls of seen attribute combinations. Empirical evaluations on benchmark datasets confirm the improved robustness of CRM compared to other methods from the literature designed to tackle various forms of subpopulation shifts.
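For orientation, the standard definition of an affine hull is recalled here next to the convex hull. This is the textbook notion only; the "special affine hulls" the abstract refers to are defined in the paper itself and are not reproduced here. The symbols $S$, $x_i$, and $\lambda_i$ are notation chosen for this note.

```latex
\[
\operatorname{aff}(S) = \Bigl\{ \sum_{i=1}^{k} \lambda_i x_i \;:\; k \ge 1,\ x_i \in S,\ \lambda_i \in \mathbb{R},\ \sum_{i=1}^{k} \lambda_i = 1 \Bigr\},
\qquad
\operatorname{conv}(S) = \Bigl\{ \sum_{i=1}^{k} \lambda_i x_i \in \operatorname{aff}(S) \;:\; \lambda_i \ge 0 \Bigr\}.
\]
```

Because the weights may be negative, the affine hull extends beyond the convex hull, which is why it is the natural object when talking about extrapolation rather than interpolation.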
On Mitigating the Utility-Loss in Differentially Private Learning: A new Perspective by a Geometrically Inspired Kernel Approach
Kumar, Mohit, Moser, Bernhard A., Fischer, Lukas
The privacy-utility tradeoff remains one of the fundamental issues of differentially private machine learning. This paper introduces a geometrically inspired kernel-based approach to mitigate the accuracy-loss issue in classification. In this approach, a representation of the affine hull of given data points is learned in Reproducing Kernel Hilbert Spaces (RKHS). This leads to a novel distance measure that hides privacy-sensitive information about individual data points and improves the privacy-utility tradeoff by significantly reducing the risk of membership inference attacks. The effectiveness of the approach is demonstrated through experiments on the MNIST dataset, the Freiburg groceries dataset, and a real biomedical dataset. It is verified that the approach remains computationally practical. The application of the approach to federated learning is considered, and it is observed that the accuracy loss due to data being distributed is either marginal or not significantly high.
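As a rough illustration of the geometric idea, the sketch below computes the ordinary Euclidean distance from a query point to the affine hull of a finite set of points. It is not the paper's RKHS construction and offers no privacy guarantee; the function name `affine_hull_distance` and the NumPy least-squares projection are assumptions made for this example.

```python
import numpy as np

def affine_hull_distance(points, query):
    """Euclidean distance from `query` to the affine hull of the rows of `points`.

    Illustrative only: plain Euclidean geometry, not a kernel (RKHS) method.
    """
    points = np.asarray(points, dtype=float)
    query = np.asarray(query, dtype=float)
    base = points[0]
    # The hull is base + span of the differences to the first point.
    directions = (points[1:] - base).T  # shape (d, n - 1)
    if directions.size == 0:
        # A single point: its affine hull is the point itself.
        return float(np.linalg.norm(query - base))
    # Least-squares projection of (query - base) onto the span of the directions.
    coeffs, *_ = np.linalg.lstsq(directions, query - base, rcond=None)
    residual = (query - base) - directions @ coeffs
    return float(np.linalg.norm(residual))

# Three points spanning the plane z = 0 in three dimensions.
plane_points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
d_above = affine_hull_distance(plane_points, [0.3, 0.4, 2.0])
d_inside = affine_hull_distance(plane_points, [5.0, -3.0, 0.0])
```

Here `d_above` is the height of the query above the plane, and `d_inside` is zero up to rounding even though the query lies far outside the triangle, since the affine hull is the whole plane rather than the convex hull.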
Geometric Insights into Support Vector Machine Behavior using the KKT Conditions
Carmichael, Iain, Marron, J. S.
The Support Vector Machine (SVM) is a powerful and widely used classification algorithm. Its performance is well known to be impacted by a tuning parameter which is frequently selected by cross-validation. This paper uses the Karush-Kuhn-Tucker conditions to provide rigorous mathematical proof for new insights into the behavior of SVM in the large and small tuning parameter regimes. These insights provide perhaps unexpected relationships between SVM and naive Bayes and maximal data piling directions. We explore how characteristics of the training data affect the behavior of SVM in many cases, including balanced vs. unbalanced classes, low vs. high dimension, and separable vs. non-separable data. These results present a simple explanation of SVM's behavior as a function of the tuning parameter. We also elaborate on the geometry of complete data piling directions in high dimensional space. The results proved in this paper suggest important implications for tuning SVM with cross-validation.
Convex Geometry of the Generalized Matrix-Fractional Function
Burke, James V., Gao, Yuan, Hoheisel, Tim
Generalized matrix-fractional (GMF) functions are a class of matrix support functions introduced by Burke and Hoheisel as a tool for unifying a range of seemingly divergent matrix optimization problems associated with inverse problems, regularization and learning. In this paper we dramatically simplify the support function representation for GMF functions as well as the representation of their subdifferentials. These new representations allow the ready computation of a range of important related geometric objects whose formulations were previously unavailable.
Analytic Feature Selection for Support Vector Machines
Stambaugh, Carly, Yang, Hui, Breuer, Felix
Support vector machines (SVMs) rely on the inherent geometry of a data set to classify training data. Because of this, we believe SVMs are an excellent candidate to guide the development of an analytic feature selection algorithm, as opposed to the more commonly used heuristic methods. We propose a filter-based feature selection algorithm based on the inherent geometry of a feature set. Through observation, we identified six geometric properties that differ between optimal and suboptimal feature sets, and have statistically significant correlations to classifier performance. Our algorithm is based on logistic and linear regression models using these six geometric properties as predictor variables. The proposed algorithm achieves excellent results on high dimensional text data sets, with features that can be organized into a handful of feature types; for example, unigrams, bigrams or semantic structural features. We believe this algorithm is a novel and effective approach to solving the feature selection problem for linear SVMs.